Hierarchical summarization of large documents
Abstract
Summarization models do not take into account the human abstractor's sentence-extraction behavior; they treat the document only as a sequence of sentences when extracting sentences for a summary. In general, a document exhibits a well-defined hierarchical structure that can be described as a fractal, a mathematical object with a high degree of redundancy. In this article, we introduce the fractal summarization model, which is based on fractal theory. Important information is captured from the source document by exploring its hierarchical structure and salient features. A condensed version of the document that is informatively close to the source is produced iteratively using the contractive transformation of fractal theory. The fractal summarization model is the first attempt to apply fractal theory to document summarization. It significantly improves the divergence of information coverage and the precision of the summary. User evaluations indicate that fractal summarization is promising and outperforms current summarization techniques that do not consider the hierarchical structure of documents.
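The idea of exploring the document hierarchy can be illustrated with a small sketch. The abstract does not specify the algorithm, so everything below is an assumption made for illustration: the `Node` class, the precomputed salience scores, and the proportional quota split are hypothetical and are not the authors' fractal summarization model. The sketch only shows one way a sentence budget might be pushed down a document tree so that each branch contributes to the summary according to its weight.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """A unit of the document hierarchy (document, chapter, section, ...).

    Hypothetical structure for illustration only.
    """
    title: str
    # (sentence, salience score) pairs; only used at leaf nodes.
    sentences: List[Tuple[str, float]] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)

    def weight(self) -> float:
        """Total salience found under this node (assumed weighting)."""
        if not self.children:
            return sum(score for _, score in self.sentences)
        return sum(child.weight() for child in self.children)


def summarize(node: Node, quota: int) -> List[str]:
    """Return up to `quota` sentences, in document order.

    The quota is split among children in proportion to their weight,
    then the function recurses until it reaches leaf nodes.
    """
    if quota <= 0:
        return []
    if not node.children:
        # Leaf: keep the highest-scoring sentences, restored to original order.
        ranked = sorted(range(len(node.sentences)),
                        key=lambda i: node.sentences[i][1], reverse=True)
        return [node.sentences[i][0] for i in sorted(ranked[:quota])]

    weights = [child.weight() for child in node.children]
    total = sum(weights)
    if total == 0:
        return []
    exact = [quota * w / total for w in weights]
    alloc = [int(x) for x in exact]
    # Hand leftover slots to the children with the largest fractional share.
    by_remainder = sorted(range(len(exact)),
                          key=lambda i: exact[i] - alloc[i], reverse=True)
    for i in by_remainder[: quota - sum(alloc)]:
        alloc[i] += 1

    summary: List[str] = []
    for child, share in zip(node.children, alloc):
        summary.extend(summarize(child, share))
    return summary


if __name__ == "__main__":
    doc = Node("doc", children=[
        Node("A", sentences=[("a1", 4.0), ("a2", 1.0), ("a3", 3.0)]),
        Node("B", sentences=[("b1", 1.5), ("b2", 0.5)]),
    ])
    print(summarize(doc, 2))
```

A branch with fewer sentences than its share simply returns what it has, so the result can be shorter than the requested quota.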
Similar resources
Impact of Document Structure on Hierarchical Summarization
The hierarchical summarization technique summarizes a large document based on the hierarchical structure and salient features of the document. A previous study has shown that hierarchical summarization is a promising technique that can effectively extract the most important information from the source document. Hierarchical summarization has been extended to the summarization of multiple documents. Thre...
Text Summarization Using Cuckoo Search Optimization Algorithm
Today, with the rapid growth of the World Wide Web and the creation of Internet sites and online text resources, text summarization has attracted considerable attention from researchers. Extractive text summarization is an important summarization method that consists of selecting the top representative sentences from the input document. When we are facing large-volume documents, the extr...
Applying Formal Concept Analysis to Teaching Material Extraction
A text summarization system can save users time when reading a large number of documents. The summary produced by such a system is usually composed of meaningful sentences that represent the content of the text. The relations between keywords usually come from their co-occurrences in the document. This study uses a hierarchical clustering method to cluster sentences and applies formal concept analysis to find o...
Hierarchical Summarizing and Evaluating for Web Pages
In this investigation we propose a novel summarization method for Web pages using hierarchical expression. We discuss the close relationship between summarization and hierarchical clustering to obtain the results, and we examine how to evaluate hierarchical summarization based on both correlation and structural aspects. We describe some experimental results using NTCIR Web documents to examine our m...
A Hybrid Hierarchical Model for Multi-Document Summarization
Scoring sentences in documents given abstract summaries created by humans is important in extractive multi-document summarization. In this paper, we formulate extractive summarization as a two-step learning problem, building a generative model for pattern discovery and a regression model for inference. We calculate scores for sentences in document clusters based on their latent characteristics u...
Journal: JASIST
Volume 59
Publication date: 2008